Overview

Dataset Statistics

Number of Variables 20
Number of Rows 98913
Missing Cells 402774
Missing Cells (%) 20.4%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 30.8 MB
Average Row Size in Memory 326.1 B
Variable Types
  • Categorical: 10
  • Numerical: 10
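
The Missing Cells figures above can be cross-checked against the per-variable Missing counts reported later in this document. The sketch below is only a consistency check under stated assumptions: the counts are copied by hand from the variable sections, _precipm is taken as fully missing (the figure given under Dataset Insights), and the names `missing_by_variable`, `N_ROWS` and `N_VARIABLES` are illustrative.

```python
# Per-variable "Missing" counts copied from the variable sections of this report.
# Variables not listed here report 0 missing values.
# Assumption: _precipm uses the 98913 figure from Dataset Insights.
missing_by_variable = {
    "_conds": 68,
    "_dewptm": 619,
    "_heatindexm": 69802,
    "_hum": 753,
    "_precipm": 98913,
    "_pressurem": 231,
    "_tempm": 669,
    "_vism": 4416,
    "_wdird": 14380,
    "_wdire": 14380,
    "_wgustm": 97850,
    "_windchillm": 98340,
    "_wspdm": 2353,
}

N_ROWS = 98913
N_VARIABLES = 20

# Missing Cells is the sum over variables; the percentage is relative to
# the total number of cells (rows x variables).
total_missing = sum(missing_by_variable.values())
missing_pct = 100 * total_missing / (N_ROWS * N_VARIABLES)

print(total_missing, round(missing_pct, 1))  # 402774 20.4
```

The sum reproduces the reported 402774 only when _precipm is counted as entirely missing, which suggests the overview total treats its "nan" entries as missing cells.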

Dataset Insights

_wgustm and _windchillm have similar distributions [Similar Distribution]
_heatindexm has 69802 (70.57%) missing values [Missing]
_precipm has 98913 (100.0%) missing values [Missing]
_vism has 4416 (4.46%) missing values [Missing]
_wdird has 14380 (14.54%) missing values [Missing]
_wdire has 14380 (14.54%) missing values [Missing]
_wgustm has 97850 (98.93%) missing values [Missing]
_windchillm has 98340 (99.42%) missing values [Missing]
_wspdm has 2353 (2.38%) missing values [Missing]
_pressurem is skewed [Skewed]
_vism is skewed [Skewed]
_wdird is skewed [Skewed]
_wgustm is skewed [Skewed]
_windchillm is skewed [Skewed]
_wspdm is skewed [Skewed]
datetime_utc has a high cardinality: 98913 distinct values [High Cardinality]
datetime_utc has constant length 14 [Constant Length]
_fog has constant length 1 [Constant Length]
_hail has constant length 1 [Constant Length]
_rain has constant length 1 [Constant Length]
_snow has constant length 1 [Constant Length]
_thunder has constant length 1 [Constant Length]
_tornado has constant length 1 [Constant Length]
datetime_utc has all distinct values [Unique]
_precipm has all distinct values (vacuously, since every value is missing) [Unique]
_wdird has 17330 (17.52%) zeros [Zeros]
_wspdm has 28884 (29.2%) zeros [Zeros]
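
One common way to act on the Missing insights is to split variables into those too sparse to use and those that can be kept, possibly with imputation. The following is a minimal sketch under assumptions: the 50% cutoff (`DROP_ABOVE`) and the function name `triage` are hypothetical choices for illustration, not something the report prescribes. The percentages are the ones listed above.

```python
# Missing percentages copied from the Dataset Insights list.
missing_pct = {
    "_heatindexm": 70.57,
    "_precipm": 100.0,
    "_vism": 4.46,
    "_wdird": 14.54,
    "_wdire": 14.54,
    "_wgustm": 98.93,
    "_windchillm": 99.42,
    "_wspdm": 2.38,
}

DROP_ABOVE = 50.0  # hypothetical threshold, in percent


def triage(pcts, drop_above=DROP_ABOVE):
    """Split variable names into (drop, keep) by missing percentage."""
    drop = sorted(name for name, pct in pcts.items() if pct > drop_above)
    keep = sorted(name for name, pct in pcts.items() if pct <= drop_above)
    return drop, keep


drop, keep = triage(missing_pct)
print("drop:", drop)
print("keep:", keep)
```

With this cutoff, the four variables that are more than 70% missing fall in the drop group; a different threshold would of course change the split.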

Variables


datetime_utc

categorical

Approximate Distinct Count 98913
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory Size 7.5 MB

Length

Mean 14
Standard Deviation 0
Median 14
Minimum 14
Maximum 14

Sample

1st row 19961101-11:00
2nd row 19961101-12:00
3rd row 19961101-13:00
4th row 19961101-14:00
5th row 19961101-16:00

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 98913
Decimal Number 1186956
  • datetime_utc contains many words: 98913 words
  • datetime_utc has words of constant length
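
The sample rows and character counts above are consistent with a fixed 14-character layout of eight date digits, a dash, and an HH:MM time (12 digits and 1 dash per row, matching 1186956 = 12 x 98913 decimal characters). A minimal sketch of parsing that layout with the standard library follows; the format string is inferred from the five sample rows only, and the one-hour expected step is an assumption.

```python
from datetime import datetime, timedelta

# Format inferred from the sample rows, e.g. "19961101-11:00".
FORMAT = "%Y%m%d-%H:%M"

samples = [
    "19961101-11:00",
    "19961101-12:00",
    "19961101-13:00",
    "19961101-14:00",
    "19961101-16:00",
]

parsed = [datetime.strptime(s, FORMAT) for s in samples]

# Assumption: observations are nominally hourly. Report any pair of
# consecutive timestamps that are not exactly one hour apart.
expected_step = timedelta(hours=1)
gaps = [(a, b) for a, b in zip(parsed, parsed[1:]) if b - a != expected_step]

for start, end in gaps:
    print("gap between", start, "and", end)
```

On the five sample rows this flags the jump from 14:00 to 16:00, so the series should not be assumed to be perfectly regular.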

_conds

categorical

Approximate Distinct Count 39
Approximate Unique (%) 0.0%
Missing 68
Missing (%) 0.1%
Memory Size 6.7 MB
  • The most frequent value (Haze) occurs over 2.31 times as often as the second most frequent value (Smoke)

Length

Mean 5.7609
Standard Deviation 3.6366
Median 4
Minimum 3
Maximum 29

Sample

1st row Smoke
2nd row Smoke
3rd row Smoke
4th row Smoke
5th row Smoke

Letter

Count 552184
Lowercase Letter 437580
Space Separator 17248
Uppercase Letter 114604
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Haze, Smoke) take over 50.0%

_dewptm

numerical

Approximate Distinct Count 51
Approximate Unique (%) 0.1%
Missing 619
Missing (%) 0.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.5 MB
Mean 15.8171
Minimum -24
Maximum 75
Zeros 164
Zeros (%) 0.2%
Negatives 243
Negatives (%) 0.2%
  • _dewptm is skewed left (γ1 = -0.0116)

Quantile Statistics

Minimum -24
5-th Percentile 5
Q1 10
Median 15
Q3 22
95-th Percentile 26
Maximum 75
Range 99
IQR 12

Descriptive Statistics

Mean 15.8171
Standard Deviation 7.0969
Variance 50.3662
Sum 1.5547e+06
Skewness -0.01162
Kurtosis -0.9975
Coefficient of Variation 0.4487
  • _dewptm is not normally distributed (p-value 0.001386768388619009)
  • _dewptm has 10 outliers
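
Several of the _dewptm figures are derived from others, so they can be recomputed as a sanity check. The sketch below uses only numbers printed above. The outlier fences use the conventional 1.5 x IQR rule, which is an assumption: the report does not state which rule produced its outlier count.

```python
# Values copied from the _dewptm statistics above.
mean, sd = 15.8171, 7.0969
q1, q3 = 10, 22
minimum, maximum = -24, 75

variance = sd ** 2             # report: 50.3662
cv = sd / mean                 # report: 0.4487 (coefficient of variation)
iqr = q3 - q1                  # report: 12
value_range = maximum - minimum  # report: 99

# Tukey fences with the conventional multiplier (assumed, not confirmed).
K = 1.5
lower_fence = q1 - K * iqr
upper_fence = q3 + K * iqr

print(round(variance, 3), round(cv, 4), iqr, value_range)
print(lower_fence, upper_fence)
```

Under this rule the fences are -8 and 40, so both the reported minimum (-24) and maximum (75) would lie outside them.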

_fog

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 6.2 MB
  • The most frequent value (0) occurs over 13.9 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 98913
  • The top 2 categories (0, 1) take over 50.0%
  • _fog has words of constant length

_hail

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 6.2 MB
  • The most frequent value (0) occurs over 7607.69 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 98913
  • The top 2 categories (0, 1) take over 50.0%
  • _hail has words of constant length

_heatindexm

numerical

Approximate Distinct Count 193
Approximate Unique (%) 0.7%
Missing 69802
Missing (%) 70.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 454.9 KB
Mean 35.6596
Minimum 26.8
Maximum 73.6
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • _heatindexm is skewed right (γ1 = 0.4315)

Quantile Statistics

Minimum 26.8
5-th Percentile 28.3
Q1 31.7
Median 35.1
Q3 39.2
95-th Percentile 44.4
Maximum 73.6
Range 46.8
IQR 7.5

Descriptive Statistics

Mean 35.6596
Standard Deviation 5.0136
Variance 25.136
Sum 1.0381e+06
Skewness 0.4315
Kurtosis -0.318
Coefficient of Variation 0.1406
  • _heatindexm has 58 outliers

_hum

numerical

Approximate Distinct Count 100
Approximate Unique (%) 0.1%
Missing 753
Missing (%) 0.8%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.5 MB
Mean 57.7744
Minimum 4
Maximum 243
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • _hum is skewed left (γ1 = -0.0664)

Quantile Statistics

Minimum 4
5-th Percentile 19
Q1 39
Median 58
Q3 77
95-th Percentile 94
Maximum 243
Range 239
IQR 38

Descriptive Statistics

Mean 57.7744
Standard Deviation 23.7545
Variance 564.2745
Sum 5.6711e+06
Skewness -0.06643
Kurtosis -0.9419
Coefficient of Variation 0.4112
  • _hum is not normally distributed (p-value 0.000221255211050295)
  • _hum has 3 outliers
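
The reported maximum of 243 is worth a second look if _hum is relative humidity in percent, which is an assumption based on the variable name; the report gives no units. A minimal range check, with the hypothetical helper `split_by_bounds` and the reported quantiles standing in for real rows, might look like this:

```python
def split_by_bounds(values, lo, hi):
    """Return (valid, invalid) where valid values satisfy lo <= v <= hi."""
    valid = [v for v in values if lo <= v <= hi]
    invalid = [v for v in values if not (lo <= v <= hi)]
    return valid, invalid


# Illustrative input: the reported minimum, Q1, median, Q3,
# 95th percentile and maximum of _hum, not actual rows.
humidity = [4, 39, 58, 77, 94, 243]

# Assumed physical bounds for relative humidity in percent.
valid, invalid = split_by_bounds(humidity, 0, 100)
print(valid)    # [4, 39, 58, 77, 94]
print(invalid)  # [243]
```

Whether out-of-range readings should be dropped, clipped, or investigated depends on how the data were recorded, which this report does not say.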

_precipm

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
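Missing 0 (every value is the literal string "nan"; Dataset Insights counts all 98913 as missing)
Missing (%) 0.0% (100.0% if "nan" is treated as missing)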
Memory Size 6.4 MB

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row nan
2nd row nan
3rd row nan
4th row nan
5th row nan

Letter

Count 296739
Lowercase Letter 296739
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • _precipm has words of constant length

_pressurem

numerical

Approximate Distinct Count 139
Approximate Unique (%) 0.1%
Missing 231
Missing (%) 0.2%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.5 MB
Mean 1951.8614
Minimum -9999
Maximum 1.0106e+08
Zeros 3
Zeros (%) 0.0%
Negatives 749
Negatives (%) 0.8%
  • _pressurem is skewed right (γ1 = 314.1274)

Quantile Statistics

Minimum -9999
5-th Percentile 996
Q1 1002
Median 1008
Q3 1014
95-th Percentile 1019
Maximum 1.0106e+08
Range 1.0107e+08
IQR 12

Descriptive Statistics

Mean 1951.8614
Standard Deviation 321710.1223
Variance 1.035e+11
Sum 1.9261e+08
Skewness 314.1274
Kurtosis 98675.0104
Coefficient of Variation 164.8222
  • _pressurem is not normally distributed (p-value 4.22651409364698e-25)
  • _pressurem has 923 outliers
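
The quantiles above sit between 996 and 1019, while the minimum (-9999) and maximum (1.0106e+08) are far outside that band and pull the mean up to 1951.86. One plausible reading is that -9999 is a missing-value code, but that is an assumption, as are the hPa unit and the 850 to 1100 bounds used below. The readings are illustrative values built from the reported quantiles and extremes, not actual rows.

```python
import statistics

# Illustrative readings: reported quantiles plus the two reported extremes.
readings = [1002, 1008, 1014, -9999, 1019, 996, 1.0106e8]

# Assumed plausible band for surface pressure in hPa (hypothetical bounds).
LOW, HIGH = 850.0, 1100.0


def clean(values, lo=LOW, hi=HIGH):
    """Keep only readings inside the assumed plausible band."""
    return [v for v in values if lo <= v <= hi]


cleaned = clean(readings)

print(statistics.mean(readings))    # dominated by the extreme value
print(statistics.mean(cleaned))     # 1007.8
print(statistics.median(cleaned))   # 1008
```

The median is barely affected by the extremes, which is why the reported median (1008) looks sensible even though the reported mean does not.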

_rain

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 6.2 MB
  • The most frequent value (0) occurs over 36.74 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 98913
  • The top 2 categories (0, 1) take over 50.0%
  • _rain has words of constant length

_snow

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 6.2 MB
  • The most frequent value (0) occurs over 98912.0 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 98913
  • The top 2 categories (0, 1) take over 50.0%
  • _snow has words of constant length

_tempm

numerical

Approximate Distinct Count 50
Approximate Unique (%) 0.1%
Missing 669
Missing (%) 0.7%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.5 MB
Mean 25.5784
Minimum 1
Maximum 90
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • _tempm is skewed left (γ1 = -0.3456)

Quantile Statistics

Minimum 1
5-th Percentile 10
Q1 19
Median 27
Q3 32
95-th Percentile 38
Maximum 90
Range 89
IQR 13

Descriptive Statistics

Mean 25.5784
Standard Deviation 8.4647
Variance 71.6517
Sum 2.5129e+06
Skewness -0.3456
Kurtosis -0.5848
Coefficient of Variation 0.3309
  • _tempm is not normally distributed (p-value 1.8211340675769962e-05)
  • _tempm has 4 outliers

_thunder

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 6.2 MB
  • The most frequent value (0) occurs over 105.82 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 98913
  • The top 2 categories (0, 1) take over 50.0%
  • _thunder has words of constant length

_tornado

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 6.2 MB
  • The most frequent value (0) occurs over 49455.5 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 98913
  • The top 2 categories (0, 1) take over 50.0%
  • _tornado has words of constant length
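
For the two-valued flag variables (_fog, _hail, _rain, _snow, _thunder, _tornado) the report gives only the frequency ratio between 0 and 1, not the counts. Because these variables have no missing values and exactly two categories, the counts can be estimated from the ratio and the row count. The sketch below is an illustration with a hypothetical helper name; the result is only as exact as the rounding of the reported ratio allows (the _fog ratio is printed to one decimal, so its counts are approximate).

```python
N_ROWS = 98913

# Ratios of the count of 0 to the count of 1, copied from the report.
ratios = {
    "_fog": 13.9,
    "_hail": 7607.69,
    "_rain": 36.74,
    "_snow": 98912.0,
    "_thunder": 105.82,
    "_tornado": 49455.5,
}


def counts_from_ratio(ratio, n_rows=N_ROWS):
    """Estimate (majority, minority) counts from majority/minority = ratio.

    Solves majority + minority = n_rows, so minority = n_rows / (1 + ratio).
    """
    minority = round(n_rows / (1 + ratio))
    return n_rows - minority, minority


estimated = {name: counts_from_ratio(r) for name, r in ratios.items()}
for name, (zeros, ones) in estimated.items():
    print(name, zeros, ones)
```

Read this way, _snow is flagged in a single row and _tornado in two, so these variables carry almost no information for most modelling purposes.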

_vism

numerical

Approximate Distinct Count 48
Approximate Unique (%) 0.1%
Missing 4416
Missing (%) 4.5%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.4 MB
Mean 2.4199
Minimum 0
Maximum 6436
Zeros 123
Zeros (%) 0.1%
Negatives 0
Negatives (%) 0.0%
  • _vism is skewed right (γ1 = 305.5707)

Quantile Statistics

Minimum 0
5-th Percentile 0.6
Q1 1.5
Median 2
Q3 3
95-th Percentile 5
Maximum 6436
Range 6436
IQR 1.5

Descriptive Statistics

Mean 2.4199
Standard Deviation 20.9707
Variance 439.7693
Sum 228671.83
Skewness 305.5707
Kurtosis 93742.8754
Coefficient of Variation 8.666
  • _vism is not normally distributed (p-value 4.226514095819465e-25)
  • _vism has 1503 outliers

_wdird

numerical

Approximate Distinct Count 63
Approximate Unique (%) 0.1%
Missing 14380
Missing (%) 14.5%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.3 MB
Mean 162.551
Minimum 0
Maximum 960
Zeros 17330
Zeros (%) 17.5%
Negatives 0
Negatives (%) 0.0%
  • _wdird is skewed left (γ1 = -0.0582)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 50
Median 150
Q3 270
95-th Percentile 320
Maximum 960
Range 960
IQR 220

Descriptive Statistics

Mean 162.551
Standard Deviation 120.0199
Variance 14404.7737
Sum 1.3741e+07
Skewness -0.05824
Kurtosis -1.465
Coefficient of Variation 0.7384
  • _wdird is not normally distributed (p-value 6.48831941688877e-13)
  • _wdird has 4 outliers
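
If _wdird is a compass bearing in degrees (an assumption based on the name, the companion _wdire variable, and quantiles that fall within 0 to 320), two cautions apply: the reported maximum of 960 is outside the 0 to 360 range, and an arithmetic mean such as 162.551 is not a meaningful average for angles. A circular mean is the usual alternative; the function below is a minimal sketch, not something the report itself computes.

```python
import math


def circular_mean_deg(angles):
    """Mean direction of angles given in degrees, returned in [0, 360]."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


# Two bearings either side of north: the arithmetic mean points south,
# while the circular mean points (approximately) north.
bearings = [350, 10]
arithmetic = sum(bearings) / len(bearings)
circular = circular_mean_deg(bearings)

print(arithmetic)
print(circular)
```

Floating-point rounding means the circular result may print as a value extremely close to 0 or to 360; both denote the same direction.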

_wdire

categorical

Approximate Distinct Count 17
Approximate Unique (%) 0.0%
Missing 14380
Missing (%) 14.5%
Memory Size 5.5 MB
  • The most frequent value (North) occurs over 1.65 times as often as the second most frequent value (West)

Length

Mean 3.5124
Standard Deviation 1.0544
Median 3
Minimum 2
Maximum 8

Sample

1st row West
2nd row North
3rd row North
4th row North
5th row North

Letter

Count 296913
Lowercase Letter 137105
Space Separator 0
Uppercase Letter 159808
Dash Punctuation 0
Decimal Number 0

_wgustm

numerical

Approximate Distinct Count 22
Approximate Unique (%) 2.1%
Missing 97850
Missing (%) 98.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 16.6 KB
Mean 37.7045
Minimum 25.9
Maximum 92.6
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • _wgustm is skewed right (γ1 = 2.0416)

Quantile Statistics

Minimum 25.9
5-th Percentile 27.8
Q1 33.3
Median 37
Q3 40.7
95-th Percentile 48.2
Maximum 92.6
Range 66.7
IQR 7.4

Descriptive Statistics

Mean 37.7045
Standard Deviation 6.8411
Variance 46.8002
Sum 40079.9
Skewness 2.0416
Kurtosis 8.8819
Coefficient of Variation 0.1814
  • _wgustm is not normally distributed (p-value 1.6835424847374792e-17)
  • _wgustm has 45 outliers

_windchillm

numerical

Approximate Distinct Count 20
Approximate Unique (%) 3.5%
Missing 98340
Missing (%) 99.4%
Infinite 0
Infinite (%) 0.0%
Memory Size 9.0 KB
Mean 5.7082
Minimum 2.1
Maximum 7.3
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • _windchillm is skewed left (γ1 = -0.5523)

Quantile Statistics

Minimum 2.1
5-th Percentile 3.54
Q1 4.9
Median 6.1
Q3 6.8
95-th Percentile 7.3
Maximum 7.3
Range 5.2
IQR 1.9

Descriptive Statistics

Mean 5.7082
Standard Deviation 1.206
Variance 1.4544
Sum 3270.8
Skewness -0.5523
Kurtosis -0.5409
Coefficient of Variation 0.2113
  • _windchillm is not normally distributed (p-value 8.163253145203484e-08)
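
For sparse variables such as _wgustm and _windchillm it helps to know how many observations the statistics actually rest on. The sketch below recomputes the non-missing counts and shows that the reported Approximate Unique (%) and Mean are consistent with being taken over non-missing values only. This is an inference from the numbers, not documented behaviour of the tool, and the dictionary layout is illustrative.

```python
N_ROWS = 98913

# Distinct count, missing count and sum copied from the variable sections.
sparse = {
    "_wgustm": {"distinct": 22, "missing": 97850, "sum": 40079.9},
    "_windchillm": {"distinct": 20, "missing": 98340, "sum": 3270.8},
}

summary = {}
for name, s in sparse.items():
    non_missing = N_ROWS - s["missing"]
    summary[name] = {
        "non_missing": non_missing,
        # Distinct values as a share of non-missing observations.
        "unique_pct": 100 * s["distinct"] / non_missing,
        # Mean recomputed from the reported sum.
        "mean": s["sum"] / non_missing,
    }

for name, row in summary.items():
    print(name, row["non_missing"],
          round(row["unique_pct"], 1), round(row["mean"], 4))
```

Only 1063 gust readings and 573 wind-chill readings are present, which is worth keeping in mind when interpreting the "similar distributions" insight for these two variables.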

_wspdm

numerical

Approximate Distinct Count 89
Approximate Unique (%) 0.1%
Missing 2353
Missing (%) 2.4%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.5 MB
Mean 7.6859
Minimum 0
Maximum 1514.9
Zeros 28884
Zeros (%) 29.2%
Negatives 0
Negatives (%) 0.0%
  • _wspdm is skewed right (γ1 = 59.4215)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 7.4
Q3 11.1
95-th Percentile 22.2
Maximum 1514.9
Range 1514.9
IQR 11.1

Descriptive Statistics

Mean 7.6859
Standard Deviation 11.996
Variance 143.9032
Sum 742146.1
Skewness 59.4215
Kurtosis 6484.4248
Coefficient of Variation 1.5608
  • _wspdm is not normally distributed (p-value 4.230637009039231e-25)
  • _wspdm has 1411 outliers
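
The zero share of _wspdm can be expressed against all rows or against observed rows only, and the two differ slightly. The sketch below recomputes both from the figures above, along with the mean implied by the reported sum; variable names are illustrative.

```python
N_ROWS = 98913

# Values copied from the _wspdm statistics above.
missing = 2353
zeros = 28884
total = 742146.1  # reported Sum

non_missing = N_ROWS - missing

zero_pct_all = 100 * zeros / N_ROWS            # share of all rows
zero_pct_observed = 100 * zeros / non_missing  # share of observed rows
mean_observed = total / non_missing            # report: 7.6859

print(non_missing)
print(round(zero_pct_all, 1), round(zero_pct_observed, 1))
print(round(mean_observed, 4))
```

The reported 29.2% matches the share of all rows. Since more than a quarter of the values are zero either way, a first quartile of 0 is expected, and the reported mean matches the sum divided by the non-missing count.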

Interactions

Correlations

Missing Values